Abstract

Use of Peer-to-Peer (P2P) service networks introduces a new communication paradigm because peers are both clients and servers, so each peer may provide services to, and request services from, other peers. Empirical studies of P2P networks have been undertaken and reveal useful characteristics. However, there is to date little analytical work describing P2P networks with respect to their communication paradigm and their interconnections. This paper provides an analytical formulation and optimisation of peer connection efficiency, in terms of minimising the fraction of wasted connection time. Peer connection efficiency is analysed for both a uni- and multi-connected peer. Given this fundamental optimisation, the paper optimises the number of connections that peers should make use of as a function of network load, in terms of minimising the total queue size that requests in the P2P network experience. The results of this paper provide a basis for engineering high performance P2P interconnection networks. The optimisations are useful for reducing bandwidth and power consumption, e.g. in the case of peers being mobile devices with a limited power supply. Also, these results could be used to determine when a (virtual) circuit should be switched to support a connection.



1 Introduction 

A number of Peer-to-Peer (P2P) projects have been developed over the last few years and the use of the P2P communication paradigm is becoming widely known. A peer is a process which can connect to and accept connections from other peers. Hence peers may provide services to, and request services from, other peers. A classification of P2P technology has been proposed in the literature.

A large number of P2P-specific applications have arisen: Napster^1 and Gnutella are well known and widely used file sharing systems^2, Freenet^3 is an adaptive P2P network application that permits the publication, replication, and retrieval of data while protecting the anonymity of both authors and readers [3], SETI@home is a well developed system for peers to distribute and process data units, and PCSCW provides a P2P-based computer supported cooperative work environment. Other examples can be found in the literature.

^1 www.napster.com

^2 See opennap.sourceforge.net for a long list of other P2P file sharing protocols.

^3 freenet.sourceforge.com

Using TCP/UDP and sockets provides a standard interface for connecting peers but is insecure. Secure connections between peers, including peer authentication, can be provided using SSL or some other form of secure communication such as Kerberos. Secure connections take longer to establish but are required for certain kinds of activity, such as resource sharing services where peers should provide only authorised access to local resources, e.g. when providing direct access to the underlying operating system as a generic computer resource.

With classic client/server systems, if the server cannot satisfy a client request then the client may redirect its request to another server by disconnecting and reconnecting. For example, the WWW defines connections between servers in the form of hyperlinks, which clients use to search for and access information. Furthermore, until recent developments such as XML, WWW servers were not responsible for these connections, which led to a significant fraction of them becoming invalid.

The communication patterns that arise via the use of P2P are fundamentally different from those of classic client/server systems. Peers may hop from peer to peer, disconnecting and reconnecting, in order to access a service. Service requests may also be passed from peer to peer, following paths determined by the peers. Unlike a static service network such as DNS, connections between peers change dynamically, e.g. to avoid unnecessary intermediate peers.

A number of empirical studies of P2P networks have been undertaken. The study by Saroiu, Gummadi et al. [2] gives detailed measurements of peer characteristics in the Napster and Gnutella networks. They built crawlers to automatically sample peer characteristics in the live peer networks. Disappointingly, but not surprisingly, the results show that the behaviour of users in a peer system may be categorised as client-like and server-like. Approximately 26% of Gnutella users share no data, 7% of Gnutella users offer more files than all of the other users combined and on average 60-80% of Napster users share 80-100% of the files. In general, about 22% of the participating peers have upstream (from peer) bottleneck bandwidths of 100 Kbps or less, which makes them unsuitable for providing content and data services. The median session (connected) time is approximately 60 minutes, corresponding to the time taken to connect and download a small number of files. Further studies of P2P networks have been reported elsewhere.

1.1 Contribution of this paper 

This paper provides an analytical formulation 
and optimisation of the fundamental principles 
governing a peer connection and service provi- 
sion. In the first case a single connection is op- 
timised so as to minimise the total time spent 
connecting or connected but unused, given an 
average arrival rate of requests. In the second 
case multiple connections are optimised so as to 
simultaneously minimise the on time of each con- 
nection while minimising the queue lengths at 
each connection, given an average aggregate ar- 
rival rate. 

The optimisations are useful for reducing 
bandwidth and power consumption, e.g. in the 
case of peers being mobile devices with a limited 
power supply. Also these results could be used 
to determine when a (virtual) circuit should be 
switched to support a connection. 

2 Peer connection efficiency 

In the following sections the formulation considers a peer which receives requests, either from a user or from some number of incoming connections, and services these requests by establishing one or more outgoing connections to other peers, i.e. the peer connects to another peer in order to service each request, as in Figure 1.



[Figure 1: Peer-to-Peer connection. Incoming requests arrive from a user or from another peer; the peer establishes a connection to a neighbour to service the request.]
2.1 Single connection

Let $t_c$ be the time to establish a connection and $t_s$ be the time taken to complete the service once the connection has been established. Let requests arrive randomly with an average arrival rate $\lambda$. For a given request, if the peer is connected (state 1) then the service time is $s_1 = t_s$; otherwise the peer is not connected (state 0) and the service time is $s_0 = t_c + t_s$.

On the arrival of a request let the peer be not connected with probability $p$ and connected with probability $q = 1 - p$. Then the average service time is

$$E[s] = p s_0 + q s_1 = p t_c + t_s. \tag{1}$$

The M/G/1 queueing system model is applicable with a mean service rate $\mu = 1/E[s]$ and utilisation $\rho = \lambda/\mu$.

It follows that, for a fraction $\rho$ of a given time interval, the peer is either connecting or the connection is being used to provide the service. For a fraction $1 - \rho$ of the same time interval, the peer is either disconnected or connected but not making use of the connection, the latter called idling. One minus the fraction of time that the peer uses to connect minus the fraction of time that the peer idles defines the peer connection efficiency:

$$\eta(p; \lambda, t_c, t_s) = 1 - \rho \frac{p t_c}{p t_c + t_s} - (1 - p)(1 - \rho). \tag{2}$$

Conversely, not using the connection when it is not needed is considered efficient, as is not having to establish a connection every time the service is requested.
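Equations 1 and 2 can be evaluated directly. The following is a minimal Python sketch, not code from the paper: the function name `connection_efficiency` and the stability check are assumptions introduced here for illustration.

```python
def connection_efficiency(p: float, lam: float, t_c: float, t_s: float) -> float:
    """Evaluate the peer connection efficiency of Equation 2.

    p   : probability that the peer is not connected when a request arrives
    lam : average arrival rate of requests
    t_c : time to establish a connection
    t_s : time to service a request over an established connection
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    mean_service = p * t_c + t_s           # Equation 1: E[s] = p*t_c + t_s
    rho = lam * mean_service               # utilisation, rho = lambda / mu
    if rho >= 1.0:
        # Assumption of this sketch: reject unstable operating points.
        raise ValueError("utilisation must be below 1 for a stable queue")
    connecting = rho * (p * t_c) / mean_service   # fraction of time spent connecting
    idling = (1.0 - p) * (1.0 - rho)              # fraction of time connected but unused
    return 1.0 - connecting - idling
```

For instance, with $t_c = t_s = 1$, $\lambda = 0.5$ and $p = 0.5$ the utilisation is 0.75, the connecting fraction is 0.25 and the idling fraction is 0.125, so the function returns 0.625.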

As an intuitive example, consider a hard drive system that has two states: spinning and sleeping. If the hard drive is sleeping when a request arrives, then it must first spin up, taking time $t_c$, before servicing the request, taking an additional time $t_s$. If the hard drive is already spinning, then the request is serviced in time $t_s$ only. Clearly the system aims to minimise the fraction of time spent spinning up and idling.

This goal is trivial for $\lambda \to 0$ by setting $p = 1$ and when $\lambda \ge 1/t_s$ by setting $p = 0$. In the first case the peer is not required, and the second case is undesirable because the service becomes unstable as the request queue grows to infinity. Maximising $\eta$ with respect to $p$ yields an optimal value:

$$\frac{d(1 - \eta)}{dp} = 2 \lambda p t_c + \lambda t_s - 1 = 0 \;\Rightarrow\; p^* = \frac{1 - \lambda t_s}{2 \lambda t_c}, \qquad 0 < \lambda < \frac{1}{t_s}. \tag{3}$$
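As a check on Equation 3, the wasted fraction $1 - \eta$ of Equation 2 can be expanded using $\rho = \lambda (p t_c + t_s)$ from Equation 1. The intermediate steps below are a reconstruction from those definitions and are not taken from the paper itself.

```latex
\begin{align*}
1 - \eta &= \rho \frac{p t_c}{p t_c + t_s} + (1 - p)(1 - \rho)
          = \lambda p t_c + (1 - p)\bigl(1 - \lambda p t_c - \lambda t_s\bigr), \\
\frac{d(1 - \eta)}{dp} &= \lambda t_c - \bigl(1 - \lambda p t_c - \lambda t_s\bigr) - (1 - p)\,\lambda t_c
          = 2 \lambda p t_c + \lambda t_s - 1, \\
\frac{d^2(1 - \eta)}{dp^2} &= 2 \lambda t_c > 0 .
\end{align*}
```

The positive second derivative indicates that the stationary point minimises the wasted fraction and therefore maximises $\eta$.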



However, for small $\lambda$, $p^*$ becomes greater than 1, so

$$p^* = \begin{cases} 1, & \lambda < a, \\ \dfrac{1 - \lambda t_s}{2 \lambda t_c}, & a \le \lambda < \dfrac{1}{t_s}, \\ 0, & \lambda \ge \dfrac{1}{t_s}, \end{cases} \tag{4}$$

where $a = \frac{1}{2 t_c + t_s}$. Figure 2(a) shows $p^*$ (series that tend to 0) and $\eta$ (series that tend to 1) as a function of normalised average arrival rate^4 for various values of $t_c$ and $t_s$. The series for $t_c = 1, 2, 3, 4$ are with $t_s = 1$ and the series for $t_s = 2, 3, 4$ are with $t_c = 1$. Arrows point to the intersection of associated series. Figure 2(b) shows the results of a simulation (given in Appendix A) that uses $p^*$ from Equation 4.
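The piecewise rule of Equation 4 is straightforward to express in code. The sketch below is illustrative only; the function name `optimal_disconnect_probability` is an assumption of this example, and the threshold is taken as $a = 1/(2 t_c + t_s)$, the arrival rate at which the unconstrained $p^*$ of Equation 3 reaches 1.

```python
def optimal_disconnect_probability(lam: float, t_c: float, t_s: float) -> float:
    """Return p* from Equation 4 for an arrival rate lam > 0."""
    if lam <= 0.0 or t_c <= 0.0 or t_s <= 0.0:
        raise ValueError("lam, t_c and t_s must be positive")
    a = 1.0 / (2.0 * t_c + t_s)      # below this rate the unconstrained p* exceeds 1
    if lam < a:
        return 1.0                   # light load: always disconnect after servicing
    if lam >= 1.0 / t_s:
        return 0.0                   # overload: never disconnect
    return (1.0 - lam * t_s) / (2.0 * lam * t_c)   # Equation 3
```

With $t_c = t_s = 1$ the threshold is $a = 1/3$, so an arrival rate of 0.1 gives $p^* = 1$ and an arrival rate of 0.5 gives $p^* = 0.5$.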

2.2 Multiple connections 

Each peer may open a number of simultaneous connections in order to satisfy service requests. Since a peer may satisfy a request transparently by simply passing the request to another peer, any peer may satisfy any type of request. With this assumption, a peer may take an incoming request and pass it to any other peer for service. A network of file mirroring peers gives an example where this assumption holds: every file is available on every peer.

In this case, let requests arrive at the peer with an average arrival rate of $\lambda$ and let the peer maintain $d$ (outgoing) connections, with each connection receiving an average of $\lambda_i$, $i = 0, 1, 2, \ldots, d - 1$, requests per second, where

$$\sum_{i=0}^{d-1} \lambda_i = \lambda.$$



^4 $\lambda$ takes values in $(0, 1/t_s)$.

[Figure 2: Probability of disconnecting that gives optimal efficiency versus normalised average arrival rate for various values of $t_c$ and $t_s$. Panels: (a) theory; (b) simulation.]



In other words, the peer forwards a request to connection $i$ with probability $\lambda_i / \lambda$. Let $\boldsymbol{\lambda} = (\lambda_0, \lambda_1, \ldots, \lambda_{d-1})$. Note that the peer must observe the bounds $0 \le \lambda_i < 1/t_s$ to ensure that the queue lengths associated with each connection are finite.
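One simple way to realise forwarding with probability $\lambda_i / \lambda$ is weighted random selection. The sketch below assumes Python's standard `random` module; the helper name `choose_connection` is hypothetical and not part of the paper.

```python
import random
from typing import Sequence


def choose_connection(rates: Sequence[float], rng: random.Random) -> int:
    """Pick connection i with probability rates[i] / sum(rates)."""
    if any(r < 0.0 for r in rates) or sum(rates) <= 0.0:
        raise ValueError("rates must be non-negative with a positive sum")
    # random.Random.choices normalises the weights internally.
    return rng.choices(range(len(rates)), weights=rates, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    # Three connections carrying 60%, 30% and 10% of the aggregate rate.
    picks = [choose_connection([0.6, 0.3, 0.1], rng) for _ in range(10)]
    print(picks)
```

A connection with $\lambda_i = 0$ is never selected, which corresponds to leaving that connection inactive.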

In the previous analysis the probability $p$ that a connection is disconnected after servicing a request was computed such that the wasted time, being the sum of connecting time and idle time, was minimised. Continuing, let $p_i$ be this probability for connection $i$. From Equation 2 let

$$\xi_i = p_i (1 - \rho_i)$$

be called the off-time of connection $i$, where $\rho_i$ is the utilisation of connection $i$. Let $t_{c,i} = t_c$ and $t_{s,i} = t_s$ be invariant over all connections. When $p_i$ is set to minimise the fraction of wasted time for a given $\lambda_i$, it maximises the fraction of time available for both servicing the requests and residing in the off state.

When $\xi_i = 0$ the connection $i$ is on for all time and conversely when $\xi_i = 1$ the connection $i$ is off for all time. For a given $\lambda_i$, the length of the request queue that connection $i$ is servicing^5 is

$$L_i = \frac{\rho_i^2 (1 + C_i^2)}{2 (1 - \rho_i)},$$

where from Equation 1

$$C_i^2 = \frac{\mathrm{Var}[s_i]}{E[s_i]^2} = \frac{p_i (1 - p_i) t_c^2}{(p_i t_c + t_s)^2}$$

is the squared coefficient of variation for the M/G/1 queueing system. The quantity $\xi_i / L_i$ is the fraction of time the connection is off per request. It follows that



$$\mathrm{OPR}(\boldsymbol{\lambda}; p^*, t_c, t_s) = \frac{\sum_{i=0}^{d-1} \xi_i}{\sum_{i=0}^{d-1} L_i} \tag{5}$$

is the average off time per request for the peer. Clearly the objective of the peer is to maximise OPR for a constant $\lambda$.
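The off time per request can be evaluated numerically for a candidate assignment of rates. The sketch below rests on two loudly stated assumptions: the queue length $L_i$ is taken to be the Pollaczek-Khinchine mean number of waiting requests, $\rho_i^2 (1 + C_i^2) / (2 (1 - \rho_i))$, and OPR is taken to be the ratio of the summed off-times to the summed queue lengths. All function names are hypothetical.

```python
import math
from typing import Sequence


def optimal_disconnect_probability(lam: float, t_c: float, t_s: float) -> float:
    """p* of Equation 4; a connection with lam == 0 is treated as always off."""
    if lam < 1.0 / (2.0 * t_c + t_s):
        return 1.0
    if lam >= 1.0 / t_s:
        return 0.0
    return (1.0 - lam * t_s) / (2.0 * lam * t_c)


def off_time_per_request(rates: Sequence[float], t_c: float, t_s: float) -> float:
    """Evaluate OPR for per-connection arrival rates (assumed form, see lead-in)."""
    total_off = 0.0
    total_queue = 0.0
    for lam_i in rates:
        if not 0.0 <= lam_i < 1.0 / t_s:
            raise ValueError("each rate must satisfy 0 <= lam_i < 1/t_s")
        p_i = optimal_disconnect_probability(lam_i, t_c, t_s)
        mean_s = p_i * t_c + t_s                          # Equation 1
        rho_i = lam_i * mean_s                            # utilisation of connection i
        xi_i = p_i * (1.0 - rho_i)                        # off-time of connection i
        c2_i = p_i * (1.0 - p_i) * t_c ** 2 / mean_s ** 2  # squared coeff. of variation
        # ASSUMPTION: Pollaczek-Khinchine mean number waiting in an M/G/1 queue.
        l_i = rho_i ** 2 * (1.0 + c2_i) / (2.0 * (1.0 - rho_i))
        total_off += xi_i
        total_queue += l_i
    if total_queue == 0.0:
        return math.inf                                   # no load on any connection
    return total_off / total_queue
```

Calling `off_time_per_request` on different splits of the same aggregate rate gives a direct way to compare candidate assignments before running a full optimiser.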

Numerical solutions, using simulated annealing, that maximise Equation 5 for the cases when $d = 2, 3, 5, 10$ are shown in Figure 3 with $\lambda \in (0, d/t_s)$. In Figure 3(a) $\lambda_0 = \lambda_1$ for small $\lambda$, and similarly in Figures 3(b), 3(c) and 3(d) (however the $\lambda_i$ are difficult to distinguish and are not shown for this last case). In Figure 3(f) $\lambda_0$ and $\lambda_1$ are equal.

^5 Using Pollaczek's formula.

[Figure 3: Values for $\lambda_i$ that maximise OPR over $d$ possible connections versus $\lambda$ (horizontal axis: normalised average arrival rate). Panels include (a) $d = 2$, $t_c = 1$, $t_s = 1$; (b) $d = 3$, $t_c = 1$, $t_s = 1$; (e) $d = 5$, $t_c = 5$, $t_s = 1$; (f) $d = 5$, $t_c = 1$, $t_s = 5$.]
From Figure 3 it is seen that the peer should partition the set of connections into a set with high utilisation and a set with low utilisation in order to achieve the objective. The vertical lines show when a $\lambda_i$ switches from a low value up to a high value. For certain regions where the solution involves a $\lambda_i = 0$ it is evident that fewer connections are better.

Clearly if $t_c > t_s$ then optimal conditions are reached by loading a small number of connections, and when $t_c < t_s$ the connections may be equally loaded most of the time. Mirrored hard drives, which have a spin-up time much greater than the time to service a request, would operate optimally according to Figure 3(e). Peer connections tend to be established faster than the time to service the request, so they would operate optimally according to Figure 3(f).



2.3 Servicing a fraction of requests 

The peer is expected to service some fraction of the total $\lambda$ requests. Let $\lambda_0$ be the chosen rate of requests that the peer services, as depicted in Figure 4. In this case Equation 5 is modified to obtain a service off time per request:

$$\mathrm{SOPR}(\boldsymbol{\lambda}; p^*, t_c, t_s) = \frac{\sum_{i=1}^{d-1} \xi_i}{L_0 + \sum_{i=1}^{d-1} L_i}, \tag{6}$$

where $L_0 = \rho_0 / (1 - \rho_0)$ and $\rho_0 = \lambda_0 t_s$. There is no $t_c$ for requests serviced by the peer and no wasted time since the peer is on all the time. Figures 5(a) and 5(b) show how SOPR compares to OPR for the case when $d = 3$ and $d = 5$ respectively.

[Figure 4: A peer servicing $\lambda_0$ of the $\lambda$ requests and forwarding those remaining.]

2.4 Expected number of peers traversed

So far $t_s$ has been considered the time taken to service a request over an established connection. Now consider $t_s$ to be the time taken to transmit the request. In this case, the request is transmitted from peer to peer until it is finally serviced. The probability of being serviced at a peer is

$$p_s = \frac{\lambda_0}{\lambda}$$

and the average number of peers that a request passes through (including the last one) is then $k = 1/p_s$.
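The value $k = 1/p_s$ is the mean of a geometric distribution. The short derivation below assumes that every peer on the path services the request independently with the same probability $p_s$; the random variable $N$ (the number of peers traversed) is notation introduced here for illustration.

```latex
\begin{align*}
P(N = n) &= p_s (1 - p_s)^{n-1}, \qquad n = 1, 2, \ldots, \\
k = E[N] &= \sum_{n=1}^{\infty} n \, p_s (1 - p_s)^{n-1}
          = \frac{p_s}{\bigl(1 - (1 - p_s)\bigr)^2}
          = \frac{1}{p_s}.
\end{align*}
```

For example, a peer that services one quarter of its incoming requests locally, $\lambda_0 = \lambda / 4$, gives $p_s = 1/4$ and an average path of $k = 4$ peers.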

The average queue length experienced by a request as it passes through a peer is

$$\bar{L} = \sum_{i=1}^{d-1} \frac{\lambda_i}{\lambda - \lambda_0} L_i$$

and so the total length of queues experienced by a request in the P2P network is

$$\sum_{i=1}^{\infty} p_s (1 - p_s)^{i-1} \left( \bar{L} (i - 1) + L_0 \right),$$

where $L_0$ is defined for SOPR. Average queue length increases in direct proportion to the number of connections that each peer maintains. This suggests that having a small number of connections is better. However, because the connections are not bandwidth constrained, having more connections allows a larger load. Computing optimal connection efficiency given bandwidth constraints requires dynamic programming techniques and is left for further research.

3 Conclusion 

Deciding if and when to disconnect from a server in order to minimise the total wasted connecting plus idle connection time with respect to the arrival of requests is an interesting problem. Minimising this wasted time maximises the connection efficiency. This paper derived a probability $p$ such that, for a given request arrival rate $\lambda$, if the client (blindly) disconnects with probability $p$ after servicing a request then the connection efficiency is maximised. In this case the client has no knowledge of the queue size. It is seen that, for a connection establishment time $t_c$ and a request service time $t_s$, the worst efficiency occurs when the request rate is about $\frac{1}{2 t_c + t_s}$. Better efficiency could be obtained by considering the queue size (whether $> 0$ or not) and other statistical information that describes the arrival of requests. This aspect is left for further study.

[Figure 5: Values for $\lambda_i$ that maximise SOPR over $d$ possible connections versus $\lambda$. Panels: (a) $d = 3$, $t_c = 1$, $t_s = 1$; (b) $d = 5$, $t_c = 1$, $t_s = 1$.]

In a P2P network, peers can connect (like clients) to other peers (like servers) in order to service incoming requests. Under a broad assumption, peers can relay requests to other peers so that any peer can transparently service any kind of request. As each connection from a peer has an associated connection efficiency, which is a function of request rate, it follows that the average connection efficiency over the peer's connections depends on the proportion of requests forwarded over each connection. To avoid solutions that give unbounded queue sizes, the paper proposed a quantity called off time per request and used simulated annealing techniques to find an assignment of request rates over a given number of connections that maximises this quantity, thus minimising wasted time.

Furthermore, the paper considered each peer to be servicing some fraction of the requests that it receives, rather than forwarding these requests to other peers. In this case a similar quantity called service off time per request was introduced and maximised subject to similar conditions.

It was seen that the forwarding of requests to peers should be divided into two general classes: a highly loaded class and a lightly loaded class. This intuitively follows from observing that the least efficient operating load for a connection, as a function of load, is between 0 and maximum load and that a lightly loaded connection is less efficient than a heavily loaded connection. For more than a few connections it is seen that some connections are better left inactive until required to handle the load.

Future research will study the delay (wait time) rather than queue length. Although delay and queue length are related as

$$W_q = \frac{L_q}{\lambda},$$

the solutions for minimum service off time per request are significantly different. Thus, the system must choose to minimise $W_q$ or $L_q$ or a combination of both. Also, analytical solutions to the multidimensional minimisation problem have not been derived. The problem is a piecewise non-linear minimisation problem with constraints. The method of simulated annealing required a large number of restarts in order to achieve the global minimum.

The study of peer connection efficiency is an interesting area of P2P research that describes fundamental P2P modes of operation using rigorous analytical theory. There are substantial grounds for further analysis and potential for new discoveries.

A Simulation algorithm 

The following simulation was executed with $T = 5000$ arrivals to obtain the results of Figure 2(b).

1. given $t_c$ and $t_s$ and a uniform random variable $X$ between 0 and 1,

2. for each $\lambda$

   (a) compute $p$ according to Equation 4

   (b) Connected = false

   (c) Waste = 0.0

   (d) CurrentTime = 0.0

   (e) ArrivalsTime = 0.0

   (f) for $T$ number of arrivals do

      i. Arrival = $(-1/\lambda) \log(1 - X)$

      ii. ArrivalsTime = ArrivalsTime + Arrival

      iii. TimeDifference = ArrivalsTime - CurrentTime

      iv. if TimeDifference < 0 then set TimeDifference = 0

      v. if Connected = true then

         A. set Waste = Waste + TimeDifference

         B. set CurrentTime = CurrentTime + TimeDifference + $t_s$

      vi. else

         A. set Waste = Waste + $t_c$

         B. set CurrentTime = CurrentTime + $t_s$ + $t_c$

      vii. set Connected = false with probability $p$ and = true with probability $1 - p$

   (g) output $\eta(\lambda) = 1 -$ Waste / CurrentTime
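The listing above can be sketched in Python as follows. This is an illustrative reading rather than the authors' code: the function names and the `seed` parameter are assumptions, exponential inter-arrival times are drawn with `random.expovariate` instead of the explicit inverse transform, and, as a deliberate deviation from the listing as printed, the clock is also advanced by the waiting time when the peer is disconnected so that CurrentTime tracks the completion time of each request.

```python
import random


def optimal_disconnect_probability(lam: float, t_c: float, t_s: float) -> float:
    """p* from Equation 4."""
    if lam < 1.0 / (2.0 * t_c + t_s):
        return 1.0
    if lam >= 1.0 / t_s:
        return 0.0
    return (1.0 - lam * t_s) / (2.0 * lam * t_c)


def simulate_efficiency(lam: float, t_c: float, t_s: float,
                        arrivals: int = 5000, seed: int = 0) -> float:
    """Estimate the efficiency eta(lam) by simulating `arrivals` requests."""
    rng = random.Random(seed)
    p = optimal_disconnect_probability(lam, t_c, t_s)
    connected = False
    waste = 0.0            # time spent connecting or connected-but-idle
    current_time = 0.0     # time at which the peer finishes its current work
    arrivals_time = 0.0    # arrival time of the current request
    for _ in range(arrivals):
        arrivals_time += rng.expovariate(lam)          # exponential inter-arrival time
        wait = max(0.0, arrivals_time - current_time)  # gap before this request arrives
        if connected:
            waste += wait                              # connection was held open but idle
            current_time += wait + t_s
        else:
            waste += t_c                               # must (re)connect first
            # ASSUMPTION: the off period also advances the clock (not in the listing).
            current_time += wait + t_c + t_s
        connected = rng.random() >= p                  # disconnect with probability p
    return 1.0 - waste / current_time


if __name__ == "__main__":
    for lam in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(lam, simulate_efficiency(lam, t_c=1.0, t_s=1.0))
```

For $t_c = t_s = 1$ and $\lambda = 0.1$ Equation 4 gives $p^* = 1$ and Equation 2 gives $\eta = 1 - \lambda t_c = 0.9$, so the estimate for that rate should lie close to 0.9.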